In this paper, we characterize the capacity of a new class of single-source multicast discrete
memoryless relay networks having a tree topology in which the root node is the source and each parent
node in the graph has at most one noisy child node and any number of noiseless child nodes. This
class of multicast tree networks includes the class of diamond networks studied by Kang and Ulukus
as a special case, where they showed that the capacity can be strictly lower than the cut-set bound. For
achievability, a novel coding scheme is constructed where each noisy relay employs a combination of
decode-and-forward (DF) and compress-and-forward (CF) and each noiseless relay performs a random
binning such that codebook constructions and relay operations are independent for each node and do not
depend on the network topology. For converse, a new technique of iteratively manipulating inequalities
exploiting the tree topology is used.
I. Introduction

In this paper, we consider a single-source multicast discrete memoryless relay network in which
the source wants to send the same message reliably to multiple destinations with the help of one or
more relays. A model of relay networks was introduced by van der Meulen in [1], [2]. However,
the single-letter capacity characterization has been open even for three-node relay networks, i.e.,
relay networks having a source, a relay, and a destination. In their seminal paper [3], Cover
and El Gamal developed two fundamental coding strategies for three-node relay networks. One
of them is decode-and-forward (DF), where the relay decodes the message and forwards it to
the destination, which was shown to be optimal for physically degraded channels [3]. DF was
generalized for multiple relays in [4], [5]. In another strategy, compress-and-forward (CF), the
relay compresses its received block and sends the compressed information to the destination.
CF was shown to achieve the capacity for some classes of relay networks [6], [7]. Recently,
CF was generalized to noisy network coding in [8] for multiple relays, which includes many
previous results on relay networks [7], [9]-[11] as special cases. A potentially better strategy is
to decode as much as possible and compress the residual information, i.e., a combination of DF
and CF [3]. Indeed such a strategy was shown to be optimal by Kang and Ulukus for a certain
class of diamond networks in [12], which consists of a source, a noisy relay, a noiseless relay
that receives exactly what the source sends, and a destination that has orthogonal finite-capacity
links from relays. For this class of diamond networks, it was shown that a combination of DF
and CF at the noisy relay is optimal and the cut-set bound is in general loose [12].

The material in this paper was presented in part at the Information Theory and Applications Workshop, UCSD, San Diego,
CA, USA, January/February 2010, at the IEEE International Symposium on Information Theory, Austin, TX, USA, June 2010,
and at the Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, Sep. 2010.

In this paper, we show the optimality of a combination of DF and CF for a new class of
single-source multicast relay networks with an arbitrary number of nodes, which includes the
class of diamond networks in [12] as a special case. In this class, which we call multicast tree
networks, a network has a tree topology in which the root node is the source and each parent
node in the graph has at most one noisy child node and any number of noiseless child nodes.
We note that the achievability and converse for diamond networks in [12] cannot be directly
generalized to those for our multicast tree networks. First, the codebook constructions and relay
operations of the coding scheme in [12] for diamond networks, which have a single destination,
vary according to the link capacities from relays to the destination. This cannot be used for
multicast tree networks since they have arbitrarily many destinations. Next, it would not be
easy to generalize the converse proof technique in [12] for diamond networks, which have only
four nodes in three levels, to our multicast tree networks, which have arbitrarily many nodes
in arbitrarily many levels. For these two reasons, we need new techniques. The key
technical contributions in the achievability and converse in this paper are as follows:

• Achievability: For the generalization to multicast tree networks, we construct a robust coding
scheme where codebook constructions and relay operations are independent for each node
and do not depend on the network topology. Such robustness of the coding scheme makes
the generalization from a single destination to multiple destinations possible.

• Converse: To get a very simple min-cut expression, we use a novel technique of iteratively
manipulating inequalities, i.e., we recursively reduce a number of inequalities into one using
the tree topology.

The organization of this paper is as follows. The model of a class of multicast tree networks
is presented in Section II. In Section III, we present lower and upper bounds on the capacity of
the class of multicast tree networks and show a condition for these two bounds to coincide. In
Section IV, we derive the lower bound by presenting a coding scheme where each noisy relay
employs a combination of DF and CF and each noiseless relay performs a random binning. In
Section V, the upper bound is shown using a recursion exploiting the tree topology. In Section VI,
we present an equivalent capacity expression for diamond networks that shows that without loss
of optimality we can construct the coding scheme such that what is compressed after decoding
at a noisy relay is a noisy observation of almost uncoded information. The conclusion of this
paper is given in Section VII.

The following notations will be used in the paper. For two integers $i$ and $j$, $[i : j]$ denotes
the set $\{i, i+1, \ldots, j\}$, $x_i^j$ denotes a row vector $(x_i, x_{i+1}, \ldots, x_j)$, and $x^j$ denotes $x_1^j$. $x_S$ for
a set $S$ denotes a row vector $(x_i : i \in S)$. According to the context, $k$ sometimes denotes the
single-element set $\{k\}$ for notational convenience.

In this paper, we follow the notion of $\epsilon$-robustly typical sequence introduced in [13]. Let
$N_{x^n}(x)$ denote the number of occurrences of $x \in \mathcal{X}$ in the sequence $x^n$. Then, $x^n$ is said to be
$\epsilon$-robustly typical (or just typical) for $\epsilon > 0$ if for every $x \in \mathcal{X}$,

$$\left| \frac{N_{x^n}(x)}{n} - p(x) \right| \le \epsilon\, p(x).$$

The set of all $\epsilon$-robustly typical $x^n$ is denoted as $T_\epsilon^{(n)}(X)$, which is shortly denoted as $T_\epsilon^{(n)}$. Similarly,
let $N_{x^n, y^n}(x, y)$ denote the number of occurrences of $(x, y) \in \mathcal{X} \times \mathcal{Y}$ in the sequence $(x^n, y^n)$.
The sequence $(x^n, y^n)$ is said to be $\epsilon$-robustly typical (or just typical) if

$$\left| \frac{N_{x^n, y^n}(x, y)}{n} - p(x, y) \right| \le \epsilon\, p(x, y)$$

for every $(x, y) \in \mathcal{X} \times \mathcal{Y}$. The set of all $\epsilon$-robustly typical $(x^n, y^n)$ is denoted by $T_\epsilon^{(n)}(X, Y)$ or
$T_\epsilon^{(n)}$ in short.
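The typicality test above can be made concrete with a short sketch. The function name `is_robustly_typical` and the dictionary representation of $p(x)$ are assumptions made for this illustration, not notation from the paper; the check itself is the displayed condition $|N_{x^n}(x)/n - p(x)| \le \epsilon\, p(x)$ applied to every symbol.

```python
from collections import Counter


def is_robustly_typical(seq, pmf, eps):
    """Return True if seq is eps-robustly typical with respect to pmf.

    pmf maps each symbol x of the alphabet to its probability p(x).
    The test is |N(x)/n - p(x)| <= eps * p(x) for every symbol x,
    so a symbol with p(x) = 0 must not occur in seq at all.
    """
    n = len(seq)
    counts = Counter(seq)
    # A symbol outside the alphabet can never appear in a typical sequence.
    if any(x not in pmf for x in counts):
        return False
    return all(abs(counts.get(x, 0) / n - p) <= eps * p for x, p in pmf.items())
```

Because the tolerance scales with $p(x)$, the same function covers the joint case by passing pairs $(x, y)$ as symbols and $p(x, y)$ as the pmf.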
II. Model

A single-source multicast discrete memoryless relay network of $N$ nodes

$$(\mathcal{X}_1 \times \cdots \times \mathcal{X}_N,\; p(y_1, \ldots, y_N | x_1, \ldots, x_N),\; \mathcal{Y}_1 \times \cdots \times \mathcal{Y}_N)$$

consists of alphabets $\mathcal{X}_k$, $\mathcal{Y}_k$ for $k \in [1 : N]$ and a collection of conditional probability mass
functions $p(y_1, \ldots, y_N | x_1, \ldots, x_N)$ where $x_k \in \mathcal{X}_k$ and $y_k \in \mathcal{Y}_k$ for $k \in [1 : N]$. Let $K$ denote the
number of destinations. Let 1 and $D_d$ denote the source and the set of nodes that forms the $d$-th
destination, respectively, and let $\mathcal{Y}_1 = \mathcal{X}_{D_d} = \emptyset$ for $d \in [1 : K]$. We note that $D_d$ for $d \in [1 : K]$
are not necessarily disjoint. Let $D = \bigcup_{d \in [1:K]} D_d$.

A $(2^{nR}, n)$ code for a single-source multicast discrete memoryless relay network of $N$ nodes
consists of a message set $\mathcal{W}_1 = [1 : 2^{nR}]$, a source encoder that assigns a codeword $x_1^n(w_1)$ to
each message $w_1 \in \mathcal{W}_1$, a set of relay encoders, where encoder $k \in [2 : N] \setminus D$ assigns a symbol
$x_{k,i}(y_k^{i-1})$ to every received sequence $y_k^{i-1}$ for $i \in [1 : n]$, and a set of decoders, where decoder
$d \in [1 : K]$ assigns an estimate $\hat{w}_{1,d}$ to each received sequence $y_{D_d}^n$. The message $W_1$ is chosen
uniformly from the set $\mathcal{W}_1$. The average probability of error for a $(2^{nR}, n)$ code is given as

$$P_e^{(n)} \triangleq P\{\hat{W}_{1,d} \ne W_1 \text{ for some } d \in [1 : K]\}.$$

A rate $R$ is said to be achievable if there exists a sequence of $(2^{nR}, n)$ codes such that $P_e^{(n)} \to 0$
as $n \to \infty$. The capacity is the supremum of all achievable rates.

A single-source multicast discrete memoryless relay network is called a multicast tree network
if the probability distribution has the form of

$$p(y_1, \ldots, y_N | x_1, \ldots, x_N) = \prod_{k \in [1:N]} p(y_k | x_{p_k})$$

where $p_k$ is called the parent node of node $k$ and $k$ is called a child node of node $p_k$. A child
node is considered to be one level lower than its parent node. A node without a parent node
is called the root node and a node that has no child node is called a leaf node. Let $L_k$ for
$k \in [1 : N]$ denote the set of leaf nodes that branch out from node $k$. For tree $T$, let $T_k$ for
$k \in [1 : N]$ denote the subtree of $T$ that consists of node $k$ and all of its descendants in $T$.

In this paper, our goal is to present lower and upper bounds on the capacity of a class of
multicast tree networks and to find some tightness conditions of those two bounds. In this class
of multicast tree networks, the source node is the root node, $D_d \subseteq L_1$ for $d \in [1 : K]$, and each
parent node has at most one noisy child node and any number of noiseless child nodes, i.e.,
$Y_k = X_{p_k}$ if $k$ is a noiseless child node of node $p_k$. Without loss of generality, we assume that
$D = L_1$. Let $G_d = \{k \mid L_k \cap D_d \ne \emptyset\}$ for $d \in [1 : K]$. Let $n_k$ and $M_k$ for $k \in [1 : N]$ denote
the noisy child node and the set of noiseless child nodes of node $k$, respectively. Let $Z_k$ for
$k \in [1 : N]$ denote the set of child nodes of node $k$, i.e., $Z_k = n_k \cup M_k$. From now on, we only
consider this class of multicast tree networks. See Fig. 1.

Fig. 1. An example of our multicast tree networks. The solid and dashed lines represent noiseless and noisy links, respectively.
In this example, the parent node of node 3 is node 1 and the child nodes of node 3 are nodes 7 and 8. Node 1 is the root node
and nodes 5, 9, 10, 11, 12, 13, and 14 are the leaf nodes. A destination is a subset of leaf nodes. For instance, destination 1 is
the set of nodes 5, 11, 12, and 13, destination 2 is the set of nodes 9, 12, and 14, and destination 3 is node 10. $L_2$ is the set
of nodes 5, 9, 10, and 11. $T_3$ is the subtree that consists of nodes 3, 7, 8, 12, 13, and 14.
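The sets $L_k$, $T_k$, and $G_d$ defined above are easy to compute from a parent map; the following is a minimal sketch. The helper names and the particular 14-node parent map `PARENT` are assumptions: the map is one hypothetical topology chosen only so that it agrees with the facts stated in the caption of Fig. 1 (leaf set, $L_2$, $T_3$), and the actual figure may differ in other respects.

```python
def children_of(parent):
    """Build child lists from a parent map {node: parent, root: None}."""
    kids = {k: [] for k in parent}
    for k, p in parent.items():
        if p is not None:
            kids[p].append(k)
    return kids


def subtree(parent, k):
    """Node set of T_k: node k together with all of its descendants."""
    kids = children_of(parent)
    nodes, stack = set(), [k]
    while stack:
        v = stack.pop()
        nodes.add(v)
        stack.extend(kids[v])
    return nodes


def leaves(parent, k):
    """Leaf set L_k: the leaf nodes that branch out from node k."""
    kids = children_of(parent)
    return {v for v in subtree(parent, k) if not kids[v]}


def relevant_nodes(parent, dest):
    """G_d = {k : L_k intersects D_d} for a destination D_d."""
    return {k for k in parent if leaves(parent, k) & set(dest)}


# Hypothetical topology, consistent with the caption of Fig. 1 only.
PARENT = {1: None, 2: 1, 3: 1, 4: 2, 5: 2, 6: 2, 7: 3, 8: 3,
          9: 4, 10: 4, 11: 6, 12: 7, 13: 7, 14: 8}
```

With this map, `leaves(PARENT, 2)` reproduces the set $L_2$ and `subtree(PARENT, 3)` reproduces the node set of $T_3$ given in the caption.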

A practical example of our multicast tree networks is depicted in Fig. 2, which represents
a sensor network where a sensor node wants to send a message to the gateway nodes at the 
boundary connected with infinite-capacity wired links. In this example, each relay node has 
outgoing links to its neighbor relays such that one of the links is arbitrarily noisy and the others 
are noiseless. Motivation for assuming noiseless links comes from a practical scenario where a 
transmitter is using a fixed modulation scheme tuned for the worst link and thus the transmission 
from the transmitter to the other receivers with better channel qualities looks almost noiseless. 

III. Main Results for Multicast Tree Networks

Let us present lower and upper bounds on the capacity of multicast tree networks.

Fig. 2. A sensor network in which a sensor node wants to send a message to gateway nodes at the boundary connected with
infinite-capacity wired links. The solid and dashed lines represent noiseless and noisy links, respectively, and thick lines at the
boundary represent infinite-capacity wired links.



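Theorem 1: The capacity $C$ of multicast tree networks is lower- and upper-bounded as

$$C \ge \max_{\prod_{k \in [1:N]} p(u_k, x_k)\, p(\hat{y}_{n_k} | u_k, y_{n_k})} \ \min_{d \in [1:K]} \ \min_{S_d} \Bigg[ \sum_{k \in A_{S_d,d}} \big( I(U_k; Y_{n_k}) + H(X_k | U_k) \big) + \sum_{k \in B_{S_d,d}} \big( I(U_k; Y_{n_k}) + I(X_k; \hat{Y}_{n_k} | U_k) \big) - \sum_{k \in C_{S_d,d}} I(\hat{Y}_{n_k}; Y_{n_k} | U_k, X_k) \Bigg] \tag{1}$$

$$C \le \max_{\prod_{k \in [1:N]} p(u_k, x_k)} \ \min_{d \in [1:K]} \ \max_{\prod_{k \in [1:N]} p(\hat{y}_{n_k} | u_k, y_{n_k})} \ \min_{S_d} \Bigg[ \sum_{k \in A_{S_d,d}} \big( I(U_k; Y_{n_k}) + H(X_k | U_k) \big) + \sum_{k \in B_{S_d,d}} \big( I(U_k; Y_{n_k}) + I(X_k; \hat{Y}_{n_k} | U_k) \big) - \sum_{k \in C_{S_d,d}} I(\hat{Y}_{n_k}; Y_{n_k} | U_k, X_k) \Bigg] \tag{2}$$

over all cuts $S_d \subseteq G_d$ such that $1 \in S_d$, $D_d \subseteq S_d^c$, $M_k \cap G_d \subseteq S_d$ if $n_k \in S_d$, and $p_k \in S_d$ if
$k \in S_d$, with cardinalities of alphabets such that

$$|\mathcal{U}_k| \le |\mathcal{X}_k| + 4 \tag{3a}$$

$$|\hat{\mathcal{Y}}_{n_k}| \le |\mathcal{U}_k| |\mathcal{Y}_{n_k}| + 2 \le |\mathcal{X}_k| |\mathcal{Y}_{n_k}| + 4 |\mathcal{Y}_{n_k}| + 2 \tag{3b}$$

for $k \in [1 : N]$. Here, $A_{S_d,d}$, $B_{S_d,d}$, and $C_{S_d,d}$ for $d \in [1 : K]$ denote the following disjoint
subsets of $S_d$.

$$A_{S_d,d} = \{k \mid k \in S_d,\ Z_k \subseteq S_d^c,\ M_k \cap G_d \ne \emptyset\}$$

$$B_{S_d,d} = \{k \mid k \in S_d,\ n_k \in S_d^c,\ M_k \cap G_d \subseteq S_d,\ n_k \cap G_d \ne \emptyset\}$$

$$C_{S_d,d} = \{k \mid k \in S_d,\ Z_k \cap G_d \subseteq S_d\}$$

See Table I.

TABLE I
Classification of $k \in S_d$ into $A_{S_d,d}$, $B_{S_d,d}$, and $C_{S_d,d}$

| | $n_k \cap G_d \ne \emptyset$, $n_k \cap G_d \subseteq S_d$ | $n_k \cap G_d \ne \emptyset$, $n_k \cap G_d \subseteq S_d^c$ | $n_k \cap G_d = \emptyset$ |
|---|---|---|---|
| $M_k \cap G_d \ne \emptyset$, $M_k \cap G_d \subseteq S_d$ | $k \in C_{S_d,d}$ | $k \in B_{S_d,d}$ | $k \in C_{S_d,d}$ |
| $M_k \cap G_d \ne \emptyset$, $M_k \cap G_d \subseteq S_d^c$ | - | $k \in A_{S_d,d}$ | $k \in A_{S_d,d}$ |
| $M_k \cap G_d = \emptyset$ | $k \in C_{S_d,d}$ | $k \in B_{S_d,d}$ | - |

"-" indicates that the corresponding cases do not happen for a cut $S_d$ of interest.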
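The cut conditions of Theorem 1 and the classification of Table I can be sketched as follows. The function names and the dictionary-based tree representation (`parent`, `noisy`, `noiseless`) are assumptions made for this illustration; a node that falls into none of the three classes raises an error rather than being assigned silently.

```python
def is_valid_cut(S, G, dest, parent, noisy, noiseless):
    """Check the cut conditions of Theorem 1 for one destination.

    S: candidate cut S_d, G: the set G_d, dest: D_d,
    parent[k]: p_k (None for the root), noisy[k]: n_k if it exists,
    noiseless[k]: the set M_k if it is nonempty.
    """
    if not S <= G or 1 not in S or S & dest:
        return False
    # p_k must belong to S_d whenever k does.
    if any(parent[k] is not None and parent[k] not in S for k in S):
        return False
    # If n_k is in S_d, then M_k restricted to G_d must be in S_d as well.
    return all((noiseless.get(k, set()) & G) <= S
               for k in G if noisy.get(k) in S)


def classify(S, G, noisy, noiseless):
    """Split a cut S_d into the sets (A, B, C) following Table I."""
    A, B, C = set(), set(), set()
    for k in S:
        n_k = noisy.get(k)
        M = noiseless.get(k, set()) & G          # M_k restricted to G_d
        n_in_G = n_k is not None and n_k in G    # n_k intersects G_d
        Z = M | ({n_k} if n_in_G else set())     # Z_k restricted to G_d
        if Z <= S:
            C.add(k)
        elif n_in_G and n_k not in S and M <= S:
            B.add(k)
        elif M and not (Z & S):
            A.add(k)
        else:
            raise ValueError(f"node {k} is in a case not covered by Table I")
    return A, B, C
```

For a four-level example with root 1, noisy child 2, noiseless child 3, and one destination leaf under each relay, the three valid cuts $\{1\}$, $\{1, 3\}$, and $\{1, 2, 3\}$ yield the three terms of a min-cut expression, while $\{1, 2\}$ is rejected because $n_1 \in S_d$ but $M_1 \not\subseteq S_d$.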

Remark 1: In Theorem 1, a cut $S_d$ of interest for destination $d \in [1 : K]$ satisfies that $p_k \in S_d$
if $k \in S_d$ and $M_k \cap G_d \subseteq S_d$ if $n_k \in S_d$, in addition to $1 \in S_d$ and $D_d \subseteq S_d^c$. This additional
condition signifies that node $p_k$ can decode whatever node $k$ can and a node in $M_k$ can decode
whatever node $n_k$ can.

We can see that the lower and upper bounds in Theorem 1 meet when the maximizing
distribution of $\prod_{k \in [1:N]} p(\hat{y}_{n_k} | u_k, y_{n_k})$ is independent of destinations. The following corollary
presents a class of such multicast tree networks. Let $a_d$ for $d \in [1 : K]$ denote the node at the
lowest level in the set $\{k \mid D_d \subseteq L_k\}$. The proof is in Appendix A.

Corollary 1: If $L_{a_i} \cap D_j = \emptyset$ for all $i, j \in [1 : K]$ such that $i \ne j$, the lower and upper bounds
in Theorem 1 coincide.

Corollary 1 says that the lower and upper bounds meet when each set of nodes forming a
destination is included in a disjoint subtree. For example, the lower and upper bounds for the
multicast tree network represented in Fig. 1 meet when destination 1 is the set of nodes 5, 9,
10, and 11, destination 2 is the set of nodes 12 and 13, and destination 3 is node 14.

For the single destination case, the lower and upper bounds in Theorem 1 coincide trivially.
In this case, the following corollary gives a simpler capacity expression.
Corollary 2: For tree networks with a single destination, the capacity is given as

$$\max \min_{S} \ I(U_S; Y_{S^c} \setminus X_S) + I(X_S; \hat{Y}_{S^c} | U_S) - I(\hat{Y}_S; Y_S | U_S, X_S) \tag{4}$$

where the minimization is over all cuts $S \subseteq [1 : N]$ such that $1 \in S$, $D \subseteq S^c$, $M_k \subseteq S$ if
$n_k \in S$, and $p_k \in S$ if $k \in S$, and the maximization is over the joint distribution of

$$\prod_{k \in [1:N]} p(u_k, x_k)\, p(\hat{y}_{n_k} | u_k, y_{n_k}) \tag{5}$$

with cardinalities of alphabets satisfying (3) for $k \in [1 : N]$. In (4), $\hat{Y}_j = X_k$ for $k \in [1 : N]$
and $j \in M_k$, and $Y_{S^c} \setminus X_S$ denotes the set

$$\{Y_j \mid j \in S^c,\ j \notin M_k \text{ for all } k \in S\}.$$

Proof: For a cut $S$ of interest, we have

$$I(U_S; Y_{S^c} \setminus X_S) = \sum_{k \in A_{S,1} \cup B_{S,1}} I(U_k; Y_{n_k})$$

$$I(X_S; \hat{Y}_{S^c} | U_S) = \sum_{k \in A_{S,1}} I(X_k; X_k, \hat{Y}_{n_k} | U_k) + \sum_{k \in B_{S,1}} I(X_k; \hat{Y}_{n_k} | U_k)
= \sum_{k \in A_{S,1}} H(X_k | U_k) + \sum_{k \in B_{S,1}} I(X_k; \hat{Y}_{n_k} | U_k)$$

$$I(\hat{Y}_S; Y_S | U_S, X_S) = \sum_{k \in C_{S,1}} I(\hat{Y}_{n_k}; Y_{n_k} | U_k, X_k)$$

from the joint distribution (5), which concludes the proof. ■

Here $U$ corresponds to the part of a message intended to be decoded by a noisy relay and $\hat{Y}$
corresponds to the compressed version of a received block.

In contrast, only CF is performed at relays in noisy network coding [8], whose achievable
rate for general single-source single-destination discrete memoryless relay networks is given as

$$\max \min_{S} \ I(X_S; \hat{Y}_{S^c}, Y_D | X_{S^c}, Q) - I(Y_S; \hat{Y}_S | X^N, \hat{Y}_{S^c}, Y_D, Q) \tag{6}$$

where the minimization is over all cuts $S \subseteq [1 : N]$ such that $1 \in S$ and $D \subseteq S^c$ and the
maximization is over the joint distribution of

$$p(q) \prod_{k \in [1:N]} p(x_k | q)\, p(\hat{y}_k | x_k, y_k, q).$$

Note that (4) and (6) are somewhat similar, especially the parts involving $\hat{Y}$'s, but (4) includes
$U$'s due to DF.
IV. Achievability

Fix a joint distribution of (5). Fix $\epsilon'' > \epsilon' > 0$ and fix $r_{k,a} \ge 0$, $r_{k,b} \ge 0$, and $r_{n_k,v} \ge 0$ for
$k \in [1 : N] \setminus D$.

1) Codebook generation: For $k \in [2 : N]$, the index set $\mathcal{W}_k$ of node $k$ is defined as

$$\mathcal{W}_k = \begin{cases} [1 : 2^{n r_{p_k,a}}] \times [1 : 2^{n r_{k,v}}] & \text{for } k = n_{p_k}, \\ [1 : 2^{n r_{p_k,a}}] \times [1 : 2^{n r_{p_k,b}}] & \text{for } k \in M_{p_k}. \end{cases}$$

For $k \in [1 : N] \setminus D$, generate the codebooks following the steps below.

• Consider a random mapping $\gamma_k$ from $\mathcal{W}_k$ to $[1 : 2^{n r_{k,a}}] \times [1 : 2^{n r_{k,b}}]$ such that each $w_k \in \mathcal{W}_k$
is mapped to $\gamma_k(w_k) = (\alpha_k(w_k), \beta_k(w_k))$, where $\alpha_k(w_k)$ and $\beta_k(w_k)$ are uniformly and
independently chosen from $[1 : 2^{n r_{k,a}}]$ and $[1 : 2^{n r_{k,b}}]$, respectively.

• Generate $2^{n r_{k,a}}$ independent codewords $u_k^n(\alpha_k)$ for $\alpha_k \in [1 : 2^{n r_{k,a}}]$, of length $n$, according
to $\prod_{i=1}^n p(u_{k,i})$.

• For each $\alpha_k \in [1 : 2^{n r_{k,a}}]$, generate $2^{n r_{k,b}}$ conditionally independent codewords $x_k^n(\beta_k | \alpha_k)$
for $\beta_k \in [1 : 2^{n r_{k,b}}]$, of length $n$, according to $\prod_{i=1}^n p(x_{k,i} | u_{k,i}(\alpha_k))$.

• For each $\alpha_k \in [1 : 2^{n r_{k,a}}]$, generate $2^{n r_{n_k,v}}$ conditionally independent codewords $\hat{y}_{n_k}^n(v_{n_k} | \alpha_k)$
for $v_{n_k} \in [1 : 2^{n r_{n_k,v}}]$, of length $n$, according to $\prod_{i=1}^n p(\hat{y}_{n_k,i} | u_{k,i}(\alpha_k))$.

• Let $x_k^n(w_k)$ denote $x_k^n(\beta_k | \alpha_k)$, where $(\alpha_k, \beta_k) = \gamma_k(w_k)$ for $w_k \in \mathcal{W}_k$.

The codebooks are revealed to all parties.
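The superposition structure of the codebook at one node can be sketched in a few lines. This is an illustration under stated assumptions, not the scheme's implementation: the function names are hypothetical, the integers `num_a` and `num_b` stand in for $2^{n r_{k,a}}$ and $2^{n r_{k,b}}$, indices are zero-based, and the covering codebook $\hat{y}_{n_k}^n(v_{n_k} | \alpha_k)$ is omitted because it is generated in the same way as the satellites.

```python
import random


def draw(pmf, rng):
    """Sample one symbol from a pmf given as {symbol: probability}."""
    symbols = list(pmf)
    return rng.choices(symbols, weights=[pmf[s] for s in symbols])[0]


def generate_codebook(n, num_a, num_b, p_u, p_x_given_u, seed=0):
    """Superposition codebook for one node k.

    Returns (u, x): u[a] is the cloud centre u_k^n(a) drawn i.i.d. from p(u),
    and x[a][b] is the satellite x_k^n(b | a), drawn symbol by symbol from
    p(x | u_{k,i}(a)).
    """
    rng = random.Random(seed)
    u = [[draw(p_u, rng) for _ in range(n)] for _ in range(num_a)]
    x = [[[draw(p_x_given_u[ui], rng) for ui in u[a]] for _ in range(num_b)]
         for a in range(num_a)]
    return u, x


def random_mapping(index_set, num_a, num_b, seed=0):
    """gamma_k: map each index w_k to a uniformly chosen pair (alpha, beta)."""
    rng = random.Random(seed)
    return {w: (rng.randrange(num_a), rng.randrange(num_b)) for w in index_set}
```

The mapping $\gamma_k$ is drawn without reference to how $\mathcal{W}_k$ was formed upstream, which mirrors the point that codebook construction at a node does not depend on the rest of the topology.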

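2) Encoding at the source: For a message $w_1 \in \mathcal{W}_1$, the source sends $x_1^n(w_1)$.

3) Processing at node $k \in [2 : N]$ such that $k = n_{p_k}$: Node $k$ operates following the steps
below.

• Find a unique $\hat{\alpha}_{p_k}$ such that

$$(u_{p_k}^n(\hat{\alpha}_{p_k}), y_k^n) \in T_{\epsilon'}^{(n)}.$$

If there is no such $\hat{\alpha}_{p_k}$, randomly pick $\hat{\alpha}_{p_k} \in [1 : 2^{n r_{p_k,a}}]$.

• Seek a $v_k$ such that

$$(u_{p_k}^n(\hat{\alpha}_{p_k}), y_k^n, \hat{y}_k^n(v_k | \hat{\alpha}_{p_k})) \in T_{\epsilon'}^{(n)}.$$

If there is more than one such index, randomly choose one among them. If there is no
such $v_k$, randomly pick $v_k \in [1 : 2^{n r_{k,v}}]$.

• Let $w_k = (\hat{\alpha}_{p_k}, v_k)$.

• If $Z_k \ne \emptyset$, node $k$ sends $x_k^n(w_k)$.

4) Processing at node $k \in [2 : N]$ such that $k \in M_{p_k}$: Node $k$ operates following the steps
below.

• Find a unique $(\hat{\alpha}_{p_k}, \hat{\beta}_{p_k})$ such that

$$x_{p_k}^n(\hat{\beta}_{p_k} | \hat{\alpha}_{p_k}) = y_k^n.$$

If there is no such $(\hat{\alpha}_{p_k}, \hat{\beta}_{p_k})$, randomly pick $(\hat{\alpha}_{p_k}, \hat{\beta}_{p_k}) \in [1 : 2^{n r_{p_k,a}}] \times [1 : 2^{n r_{p_k,b}}]$.

• Let $w_k = (\hat{\alpha}_{p_k}, \hat{\beta}_{p_k})$.

• If $Z_k \ne \emptyset$, node $k$ sends $x_k^n(w_k)$.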

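5) Decoding at the destinations: The $d$-th destination for $d \in [1 : K]$ decodes the message
following the steps below.

• Construct a subset $F_{k,d}$ of $\mathcal{W}_k$ for every $k \in [1 : N]$ in the following way. For $k \in D_d$, let
$F_{k,d} = \{w_k\}$. For $k \notin G_d$, let $F_{k,d} = \mathcal{W}_k$. For all the other $k$'s, i.e., $k \in G_d \setminus D_d$, $F_{k,d}$'s are
constructed recursively as

$$F_{k,d} = \{w_k \mid (u_k^n(\alpha_k(w_k)), x_k^n(\beta_k(w_k) | \alpha_k(w_k)), \hat{y}_{n_k}^n(v_{n_k} | \alpha_k(w_k))) \in T_{\epsilon''}^{(n)},$$
$$(\alpha_k(w_k), v_{n_k}) \in F_{n_k,d},\ (\alpha_k(w_k), \beta_k(w_k)) \in F_{j,d} \text{ for all } j \in M_k \text{ for some } v_{n_k} \in [1 : 2^{n r_{n_k,v}}]\}.$$

• Find a unique $\hat{w}_{1,d} \in F_{1,d}$. If there is no such $\hat{w}_{1,d}$, randomly pick $\hat{w}_{1,d} \in \mathcal{W}_1$. The destination
declares that $\hat{w}_{1,d}$ was sent.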

6) Analysis of the probability of error: We analyze the probability of error for message $W_1$
averaged over the codebook ensemble. Let $W_k$ denote the chosen index at node $k$ for $k \in [2 : N]$
and let $V_{n_k}$ denote the chosen covering index at node $n_k$ for $k \in [1 : N] \setminus D$. Let us first introduce
the notion of a supporting rate.

Definition 1: For our coding scheme, $T_k$ for $k \in [1 : N]$ is said to support a rate $r_k$ or have
a supporting rate $r_k$ for destination $d \in [1 : K]$ if, for any $\epsilon > 0$,

$$\mu_{k,d}^{(n)} \triangleq P(W_k \notin F_{k,d}) \le \epsilon$$

$$\nu_{k,d}^{(n)} \triangleq P(w_k' \in F_{k,d}) \le 2^{-n(r_k - \epsilon)}$$

for $w_k' \ne W_k$ for sufficiently small $\epsilon'$ and $\epsilon''$ and sufficiently large $n$.¹ Note that the supremum
of the supporting rate of $T_k$ for destination $d \in [1 : K]$ becomes infinity and zero when $k \in D_d$
and $k \notin G_d$, respectively.

The following lemma shows that $R < r_1$ is achievable if $T = T_1$ supports a rate $r_1$ for all
destinations.

Lemma 1: If $T = T_1$ supports a rate $r_1$ for all destinations, $R < r_1$ is achievable.

Proof: Fix $\epsilon > 0$. If $T$ supports a rate $r_1$ for all destinations, the average probability of
error using our coding scheme is upper-bounded as

$$P_e^{(n)} = P\{\hat{W}_{1,d} \ne W_1 \text{ for some } d \in [1 : K]\}
\le \sum_{d \in [1:K]} \big( \mu_{1,d}^{(n)} + 2^{nR} \nu_{1,d}^{(n)} \big)
\le K \big( \epsilon + 2^{-n(r_1 - R - \epsilon)} \big) \tag{7}$$

for sufficiently large $n$. Note that (7) is upper-bounded by $(K + 1)\epsilon$ for sufficiently large $n$ if
$R < r_1 - \epsilon$. Thus, $R < r_1$ is achievable. ■
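The last step of the proof can be checked numerically: once $R < r_1 - \epsilon$, the exponential term in (7) eventually drops below $\epsilon / K$. The sketch below finds the first block length at which the bound (7) falls under $(K + 1)\epsilon$; the function names and the sample parameters in the note are assumptions for illustration only.

```python
def error_bound(n, K, eps, r1, R):
    """Right-hand side of (7): K * (eps + 2^{-n (r1 - R - eps)})."""
    return K * (eps + 2.0 ** (-n * (r1 - R - eps)))


def smallest_blocklength(K, eps, r1, R, n_max=10**6):
    """Smallest n <= n_max for which the bound (7) is at most (K + 1) * eps.

    Returns None if no such n is found within n_max.
    """
    if R >= r1 - eps:
        raise ValueError("the argument requires R < r1 - eps")
    for n in range(1, n_max + 1):
        if error_bound(n, K, eps, r1, R) <= (K + 1) * eps:
            return n
    return None
```

For instance, with the hypothetical values `K=3`, `eps=0.01`, `r1=1.0`, and `R=0.5`, the required block length grows roughly like $\log_2(K/\epsilon) / (r_1 - R - \epsilon)$, so a smaller gap between $R$ and $r_1$ demands a longer code.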

Now, let us derive a sufficient condition for a supporting rate $r_1$ of $T$ for all destinations using
the following lemma. The proof is at the end of this section.

Lemma 2: Consider $d \in [1 : K]$ and $k \in G_d \setminus D_d$. If $T_j$ for $j \in Z_k$ supports a rate $r_j$ for
destination $d$, $T_k$ supports a rate $r_k$ for destination $d$ such that

$$r_k < I(U_k; Y_{n_k}) + H(X_k | U_k) \tag{8a}$$

$$r_k < \sum_{j \in M_k \cap G_d} r_j + I(U_k; Y_{n_k}) + I(X_k; \hat{Y}_{n_k} | U_k) \tag{8b}$$

$$r_k < \sum_{j \in Z_k \cap G_d} r_j - I(\hat{Y}_{n_k}; Y_{n_k} | U_k, X_k). \tag{8c}$$

To get a bound on the supporting rate $r_1$ of $T$ for destination $d \in [1 : K]$ using Lemma 2,
we apply the Fourier-Motzkin elimination to the set of inequalities (8) for all $k \in G_d \setminus D_d$ by
removing all the other $r_k$'s, i.e., $k \in G_d \setminus D_d \setminus \{1\}$.² The resultant inequalities of $r_1$ can be
written as the min-cut form

$$r_1 < \min_{S_d} \Bigg[ \sum_{k \in A_{S_d,d}} \big( I(U_k; Y_{n_k}) + H(X_k | U_k) \big) + \sum_{k \in B_{S_d,d}} \big( I(U_k; Y_{n_k}) + I(X_k; \hat{Y}_{n_k} | U_k) \big) - \sum_{k \in C_{S_d,d}} I(\hat{Y}_{n_k}; Y_{n_k} | U_k, X_k) \Bigg]$$

where the minimization is over all cuts $S_d$ considered in Theorem 1. Here, each cut $S_d$
corresponds to the set of inequalities that results in an inequality of $r_1$ in the Fourier-Motzkin
elimination, i.e., the set of inequalities consists of (8a) for $k \in A_{S_d,d}$, (8b) for $k \in B_{S_d,d}$, and
(8c) for $k \in C_{S_d,d}$.

For all destinations, we obtain the following sufficient condition for a supporting rate $r_1$.

$$r_1 < \min_{d \in [1:K]} \min_{S_d} \Bigg[ \sum_{k \in A_{S_d,d}} \big( I(U_k; Y_{n_k}) + H(X_k | U_k) \big) + \sum_{k \in B_{S_d,d}} \big( I(U_k; Y_{n_k}) + I(X_k; \hat{Y}_{n_k} | U_k) \big) - \sum_{k \in C_{S_d,d}} I(\hat{Y}_{n_k}; Y_{n_k} | U_k, X_k) \Bigg] \tag{9}$$

From Lemma 1, all rates less than the right-hand side of (9) are achievable. By considering
all joint distributions of (5), the lower bound in Theorem 1 is proved.

¹ $P(w_k' \in F_{k,d})$ for all $w_k' \ne W_k$ are the same due to the symmetry of the codebook generation.
² Note that $r_k$ for $k \in D_d$ is given by infinity.
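The effect of the Fourier-Motzkin elimination can be illustrated with a bottom-up recursion: at every node the largest rate allowed by (8a)-(8c) is the minimum of the three right-hand sides, with $r_k = \infty$ for $k \in D_d$ and $r_k = 0$ for $k \notin G_d$. The sketch below assumes the per-node information quantities are supplied by the caller as dictionaries `a`, `b`, and `c`; these names and the function name are hypothetical, and the returned value is the supremum of the strict inequalities rather than an achievable rate itself.

```python
import math


def supporting_rate(k, dest, G, noisy, noiseless, a, b, c):
    """Supremum of r_k permitted by (8a)-(8c), evaluated bottom-up.

    dest: D_d, G: G_d, noisy[k]: n_k if it exists, noiseless[k]: M_k.
    a[k] = I(U_k; Y_nk) + H(X_k | U_k)
    b[k] = I(U_k; Y_nk) + I(X_k; Yhat_nk | U_k)
    c[k] = I(Yhat_nk; Y_nk | U_k, X_k)
    """
    if k in dest:
        return math.inf      # r_k is infinite for k in D_d
    if k not in G:
        return 0.0           # no leaf of D_d branches out from k

    def rec(j):
        return supporting_rate(j, dest, G, noisy, noiseless, a, b, c)

    M = noiseless.get(k, set()) & G
    n_k = noisy.get(k)
    Z = M | ({n_k} if n_k in G else set())
    return min(a[k],
               sum(rec(j) for j in M) + b[k],
               sum(rec(j) for j in Z) - c[k])
```

On a diamond-shaped example (root 1, noisy relay 2, noiseless relay 3, one destination leaf under each relay), the recursion at the root returns the minimum of `a[1]`, `a[3] + b[1]`, and `a[2] + a[3] - c[1]`, which are the terms selected by the cuts $\{1\}$, $\{1, 3\}$, and $\{1, 2, 3\}$.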

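Proof of Lemma 2: Fix $d \in [1 : K]$ and $k \in G_d \setminus D_d$. Fix any $\epsilon > 0$. Without loss of generality,
assume that $W_k = (1, 1)$ and $\gamma_k(1, 1) = (1, 1)$. First, $\mu_{k,d}^{(n)}$ is upper-bounded as

$$\mu_{k,d}^{(n)} \le P\Big( \tilde{E}_1 \cup \tilde{E}_2 \cup \tilde{E}_3 \cup \tilde{E}_4 \cup E_1 \cup E_2 \cup \bigcup_{j \in M_k} E_{3,j} \Big)$$
$$\le P(\tilde{E}_1) + P(\tilde{E}_2) + P(\tilde{E}_3) + P(\tilde{E}_4) + P(E_1 \cap \tilde{E}_1^c) + P(E_2 | \tilde{E}_2^c \cap \tilde{E}_3^c) + \sum_{j \in M_k} P(E_{3,j} | \tilde{E}_4^c) \tag{10}$$

where the events are defined as

$$E_1 = \{(U_k^n(1), X_k^n(1|1), \hat{Y}_{n_k}^n(V_{n_k}|1)) \notin T_{\epsilon''}^{(n)}\}$$
$$E_2 = \{(1, V_{n_k}) \notin F_{n_k,d}\}$$
$$E_{3,j} = \{(1, 1) \notin F_{j,d}\} \text{ for } j \in M_k$$
$$\tilde{E}_1 = \{(U_k^n(1), Y_{n_k}^n, \hat{Y}_{n_k}^n(v_{n_k}|1)) \notin T_{\epsilon'}^{(n)} \text{ for all } v_{n_k} \in [1 : 2^{n r_{n_k,v}}]\}$$
$$\tilde{E}_2 = \{(U_k^n(1), Y_{n_k}^n) \notin T_{\epsilon'}^{(n)}\}$$
$$\tilde{E}_3 = \{(U_k^n(\alpha_k), Y_{n_k}^n) \in T_{\epsilon'}^{(n)} \text{ for some } \alpha_k \ne 1\}$$
$$\tilde{E}_4 = \{X_k^n(\beta_k | \alpha_k) = X_k^n(1|1) \text{ for some } (\alpha_k, \beta_k) \ne (1, 1)\}.$$

Note that $\tilde{E}_1^c$ implies that $(U_k^n(1), Y_{n_k}^n, \hat{Y}_{n_k}^n(V_{n_k}|1)) \in T_{\epsilon'}^{(n)}$, $\tilde{E}_2^c \cap \tilde{E}_3^c$ implies that $W_{n_k} = (1, V_{n_k})$,
and $\tilde{E}_4^c$ implies that $W_j = (1, 1)$ for all $j \in M_k$. Let us upper bound each term in the right-hand
side of (10).

• If $r_{n_k,v} > I(\hat{Y}_{n_k}; Y_{n_k} | U_k) + \delta(\epsilon')$,³ we have $P(\tilde{E}_1) \le \epsilon$ for sufficiently large $n$ from the
covering lemma [14].

• By the law of large numbers, we have $P(\tilde{E}_2) \le \epsilon$ for sufficiently large $n$.

• If $r_{k,a} < I(U_k; Y_{n_k}) - \delta(\epsilon')$, we have $P(\tilde{E}_3) \le \epsilon$ for sufficiently large $n$ from the packing
lemma [14].

• If $r_{k,a} + r_{k,b} < H(X_k) - \delta(\epsilon')$ and $r_{k,b} < H(X_k | U_k) - \delta(\epsilon')$, we have $P(\tilde{E}_4) \le \epsilon$ for
sufficiently large $n$.

• We have

$$P(E_1 \cap \tilde{E}_1^c) = P\{(U_k^n(1), X_k^n(1|1), \hat{Y}_{n_k}^n(V_{n_k}|1)) \notin T_{\epsilon''}^{(n)},\ (U_k^n(1), Y_{n_k}^n, \hat{Y}_{n_k}^n(V_{n_k}|1)) \in T_{\epsilon'}^{(n)}\}$$
$$\le \sum_{(u_k^n, y_{n_k}^n, \hat{y}_{n_k}^n) \in T_{\epsilon'}^{(n)}} p(u_k^n, y_{n_k}^n, \hat{y}_{n_k}^n)\, P\{(U_k^n(1), X_k^n(1|1), \hat{Y}_{n_k}^n(V_{n_k}|1)) \notin T_{\epsilon''}^{(n)} \mid u_k^n, y_{n_k}^n, \hat{y}_{n_k}^n\}$$
$$\overset{(a)}{\le} \epsilon$$

for sufficiently large $n$, where (a) is from the conditional typicality lemma [14].

• We have $P(E_2 | \tilde{E}_2^c \cap \tilde{E}_3^c) = \mu_{n_k,d}^{(n)} \le \epsilon$ for sufficiently large $n$.

• We have $\sum_{j \in M_k} P(E_{3,j} | \tilde{E}_4^c) = \sum_{j \in M_k} \mu_{j,d}^{(n)} \le \epsilon$ for sufficiently large $n$.

Let us choose $r_{k,a}$, $r_{k,b}$, and $r_{n_k,v}$ to satisfy the conditions above, with

$$r_{k,b} = H(X_k | U_k) - 2\delta(\epsilon').$$

For the above choice of $r_{k,a}$, $r_{k,b}$, and $r_{n_k,v}$, we have $\mu_{k,d}^{(n)} \le 7\epsilon$ for sufficiently large $n$.
Now, consider $w_k' \ne (1, 1)$. $\nu_{k,d}^{(n)}$ is upper-bounded as

$$\nu_{k,d}^{(n)} \le P(E_4 \cup E_5 \cup E_6)$$
$$\le P(\tilde{E}_2 \cup \tilde{E}_3 \cup \tilde{E}_4 \cup E_4 \cup E_5 \cup E_6)$$
$$\le P(\tilde{E}_2) + P(\tilde{E}_3) + P(\tilde{E}_4) + P(E_4) + P(E_5 \cap \tilde{E}_2^c \cap \tilde{E}_3^c \cap \tilde{E}_4^c) + P(E_6 \cap \tilde{E}_2^c \cap \tilde{E}_3^c \cap \tilde{E}_4^c)$$
$$\le 3\epsilon + P(E_4) + P(E_5 \cap \tilde{E}_2^c \cap \tilde{E}_3^c \cap \tilde{E}_4^c) + P(E_6 \cap \tilde{E}_2^c \cap \tilde{E}_3^c \cap \tilde{E}_4^c) \tag{11}$$

for sufficiently large $n$, where the events are given as

$$E_4 = \{\gamma_k(w_k') = (1, 1)\}$$
$$E_5 = \{\gamma_k(w_k') = (1, \beta_k),\ (U_k^n(1), X_k^n(\beta_k|1), \hat{Y}_{n_k}^n(V_{n_k}|1)) \in T_{\epsilon''}^{(n)},\ (1, V_{n_k}) \in F_{n_k,d},$$
$$(1, \beta_k) \in F_{j,d} \text{ for all } j \in M_k \text{ for some } \beta_k \ne 1\}$$
$$E_6 = \{\gamma_k(w_k') = (\alpha_k, \beta_k),\ (U_k^n(\alpha_k), X_k^n(\beta_k|\alpha_k), \hat{Y}_{n_k}^n(v_{n_k}|\alpha_k)) \in T_{\epsilon''}^{(n)},\ (\alpha_k, v_{n_k}) \in F_{n_k,d},$$
$$(\alpha_k, \beta_k) \in F_{j,d} \text{ for all } j \in M_k \text{ for some } (\alpha_k, \beta_k) \ne (1, 1) \text{ and } (\alpha_k, v_{n_k}) \ne (1, V_{n_k})\}.$$

Let us upper bound each term in the right-hand side of (11).

³ Here and from now on, $\delta(\epsilon') \to 0$ as $\epsilon' \to 0$.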
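• $P(E_4)$ is given as $P(E_4) = 2^{-n(r_{k,a} + r_{k,b})}$, since $\gamma_k(w_k')$ is uniformly distributed over
$[1 : 2^{n r_{k,a}}] \times [1 : 2^{n r_{k,b}}]$.

• $P(E_5 \cap \tilde{E}_2^c \cap \tilde{E}_3^c \cap \tilde{E}_4^c)$ is upper-bounded for sufficiently large $n$ by applying the joint typicality
lemma [14] to the codewords indexed by $\beta_k \ne 1$.

• $P(E_6 \cap \tilde{E}_2^c \cap \tilde{E}_3^c \cap \tilde{E}_4^c)$ is upper-bounded for sufficiently large $n$ by applying the joint typicality
lemma [14] to the codewords indexed by $(\alpha_k, \beta_k) \ne (1, 1)$.

Note that $r_j = 0$ for $j \notin G_d$. Thus, we have

$$\nu_{k,d}^{(n)} \le 2^{-n(r_k - \epsilon)}$$

with $r_k$ satisfying (8) for sufficiently small $\epsilon'$ and $\epsilon''$ and sufficiently large $n$.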

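V. Upper Bound

Fix $d \in [1 : K]$. Let $U_{k,i} \triangleq (X_{k,i+1}^n, Y_{n_k}^{i-1})$ and $\hat{Y}_{n_k,i} \triangleq Y_{L_{n_k} \cap D_d}^n$ for $k \in [1 : N]$ and $i \in [1 : n]$.
Note that

$$p(u_{k,i}, x_{k,i}, y_{n_k,i}, \hat{y}_{n_k,i}) = p(u_{k,i}, x_{k,i})\, p(y_{n_k,i} | x_{k,i})\, p(\hat{y}_{n_k,i} | u_{k,i}, y_{n_k,i})$$

for $k \in [1 : N]$ and $i \in [1 : n]$. Consider a cut $S_d$ considered in Theorem 1.
Let us first present two lemmas and a corollary.

Lemma 3: For $k \in [1 : N]$, the following inequalities and equality hold.

$$\sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + H(X_{k,i} | U_{k,i}) \big) - H(X_k^n) \ge 0 \tag{12a}$$

$$\sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + I(X_{k,i}; \hat{Y}_{n_k,i} | U_{k,i}) \big) - I(X_k^n; Y_{L_{n_k} \cap D_d}^n) \ge 0 \tag{12b}$$

$$\sum_{i=1}^n I(\hat{Y}_{n_k,i}; Y_{n_k,i} | U_{k,i}, X_{k,i}) = I(Y_{n_k}^n; Y_{L_{n_k} \cap D_d}^n | X_k^n) \tag{12c}$$

Lemma 4: The following inequalities hold.

$$I(X_k^n; Y_{L_k \cap D_d}^n) - H(X_k^n) \le 0 \quad \text{for } k \in [1 : N] \tag{13a}$$

$$I(X_k^n; Y_{L_k \cap D_d}^n) - I(X_k^n; Y_{L_{n_k} \cap D_d}^n) \le \sum_{j \in M_k \cap G_d} I(X_j^n; Y_{L_j \cap D_d}^n) \quad \text{for } k \in B_{S_d,d} \tag{13b}$$

$$I(X_k^n; Y_{L_k \cap D_d}^n) + I(Y_{n_k}^n; Y_{L_{n_k} \cap D_d}^n | X_k^n) \le \sum_{j \in Z_k \cap G_d} I(X_j^n; Y_{L_j \cap D_d}^n) \quad \text{for } k \in C_{S_d,d} \tag{13c}$$

The proofs of Lemmas 3 and 4 are in Appendices B and C, respectively. From Lemmas 3 and
4, we have the following corollary.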

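Corollary 3: We have

$$I(X_1^n; Y_{D_d}^n) \le \sum_{k \in A_{S_d,d}} \sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + H(X_{k,i} | U_{k,i}) \big) + \sum_{k \in B_{S_d,d}} \sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + I(X_{k,i}; \hat{Y}_{n_k,i} | U_{k,i}) \big) - \sum_{k \in C_{S_d,d}} \sum_{i=1}^n I(\hat{Y}_{n_k,i}; Y_{n_k,i} | U_{k,i}, X_{k,i}).$$

Proof: We have

$$I(X_1^n; Y_{D_d}^n) \le I(X_1^n; Y_{D_d}^n) + \sum_{k \in S_d} \psi(k) + \sum_{k \in A_{S_d,d}} \sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + H(X_{k,i} | U_{k,i}) \big)$$
$$+ \sum_{k \in B_{S_d,d}} \sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + I(X_{k,i}; \hat{Y}_{n_k,i} | U_{k,i}) \big) - \sum_{k \in C_{S_d,d}} \sum_{i=1}^n I(\hat{Y}_{n_k,i}; Y_{n_k,i} | U_{k,i}, X_{k,i})$$

from Lemma 3, where $\psi(k)$ for $k \in S_d$ is defined as

$$\psi(k) \triangleq \begin{cases} -H(X_k^n) & \text{if } k \in A_{S_d,d} \\ -I(X_k^n; Y_{L_{n_k} \cap D_d}^n) & \text{if } k \in B_{S_d,d} \\ I(Y_{n_k}^n; Y_{L_{n_k} \cap D_d}^n | X_k^n) & \text{if } k \in C_{S_d,d}. \end{cases}$$

Now, it remains to show

$$I(X_1^n; Y_{D_d}^n) + \sum_{k \in S_d} \psi(k) \le 0. \tag{14}$$

From Lemma 4, we have

$$I(X_k^n; Y_{L_k \cap D_d}^n) + \psi(k) \le \sum_{j \in Z_k \cap S_d} I(X_j^n; Y_{L_j \cap D_d}^n) \tag{15}$$

for $k \in S_d$. Using the inequality (15) recursively for all $k \in S_d$ starting from $k = 1$, the
inequality (14) is proved from the fact that node $k$ at the boundary of $S_d$ is included in $A_{S_d,d}$
and $Z_k \cap S_d = \emptyset$ for $k \in A_{S_d,d}$. ■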

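Now, we are ready to prove the upper bound in Theorem 1. In the following, $\epsilon_n$ tends to zero
as $n$ tends to infinity. We have

$$nR = H(X_1^n)$$
$$= I(X_1^n; Y_{D_d}^n) + H(X_1^n | Y_{D_d}^n)$$
$$\overset{(a)}{\le} I(X_1^n; Y_{D_d}^n) + n\epsilon_n$$
$$\overset{(b)}{\le} n\epsilon_n + \sum_{k \in A_{S_d,d}} \sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + H(X_{k,i} | U_{k,i}) \big)$$
$$+ \sum_{k \in B_{S_d,d}} \sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + I(X_{k,i}; \hat{Y}_{n_k,i} | U_{k,i}) \big)$$
$$- \sum_{k \in C_{S_d,d}} \sum_{i=1}^n I(\hat{Y}_{n_k,i}; Y_{n_k,i} | U_{k,i}, X_{k,i})$$

where (a) is due to Fano's inequality and (b) is from Corollary 3.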

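Let $Q$ denote a time-sharing random variable uniformly distributed over $[1 : n]$ that is
independent of all the other variables. Define random variables $(U_k', X_k, Y_{n_k}, \hat{Y}_{n_k}')$ for $k \in [1 : N]$
such that

$$p(U_k' = u_k, X_k = x_k, Y_{n_k} = y_{n_k}, \hat{Y}_{n_k}' = \hat{y}_{n_k} \mid Q = i) = p(U_{k,i} = u_k, X_{k,i} = x_k, Y_{n_k,i} = y_{n_k}, \hat{Y}_{n_k,i} = \hat{y}_{n_k})$$

for $i \in [1 : n]$. Let $U_k = (U_k', Q)$ and $\hat{Y}_{n_k} = (\hat{Y}_{n_k}', Q)$ for $k \in [1 : N]$. Then, we have

$$\frac{1}{n} \sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + H(X_{k,i} | U_{k,i}) \big) \le I(U_k; Y_{n_k}) + H(X_k | U_k),$$

$$\frac{1}{n} \sum_{i=1}^n \big( I(U_{k,i}; Y_{n_k,i}) + I(X_{k,i}; \hat{Y}_{n_k,i} | U_{k,i}) \big) \le I(U_k; Y_{n_k}) + I(X_k; \hat{Y}_{n_k} | U_k),$$

and

$$\frac{1}{n} \sum_{i=1}^n I(\hat{Y}_{n_k,i}; Y_{n_k,i} | U_{k,i}, X_{k,i}) = I(\hat{Y}_{n_k}; Y_{n_k} | U_k, X_k).$$

Hence, we get

$$R \le \epsilon_n + \sum_{k \in A_{S_d,d}} \big( I(U_k; Y_{n_k}) + H(X_k | U_k) \big) + \sum_{k \in B_{S_d,d}} \big( I(U_k; Y_{n_k}) + I(X_k; \hat{Y}_{n_k} | U_k) \big) - \sum_{k \in C_{S_d,d}} I(\hat{Y}_{n_k}; Y_{n_k} | U_k, X_k). \tag{16}$$

Note that only the marginal distributions $p(u_k, x_k, y_{n_k}, \hat{y}_{n_k})$ for $k \in [1 : N]$ are needed to
evaluate the right-hand side of (16). Thus, we do not lose generality when we only consider
the joint distribution of (5). Since the definition of $\hat{Y}_{n_k}$ for $k \in [1 : N]$ depends on $D_d$'s for
$d \in [1 : K]$, the minimization over $d \in [1 : K]$ has to be outside the maximization over
$\prod_{k \in [1:N]} p(\hat{y}_{n_k} | u_k, y_{n_k})$, which results in the upper bound (2). The cardinality bound (3) for $U_k$
and $\hat{Y}_{n_k}$ for $k \in [1 : N]$ can be obtained in a similar way as in [15].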

VI. Diamond Networks

In this section, we present an alternative capacity expression for a simple tree network with a
single destination, called a diamond network, in which the root node has one noisy child node
and one noiseless child node, each node at the second level has a single noiseless child node,
and nodes at the third level form the destination. In the following, nodes 1, 2, and 3 are the
source, noisy relay, and noiseless relay, respectively.

October 19, 2011 DRAFT

The capacity of diamond networks was first characterized by Kang and Ulukus [12].

Theorem 2 (Kang and Ulukus [12]): The capacity of diamond networks is given as

$$\max_{\substack{p(u_1, x_1)\, p(\hat{y}_2 | y_2, u_1): \\ r_2 \ge I(\hat{Y}_2; Y_2 | U_1, X_1) \\ r_3 \ge H(X_1 | U_1, \hat{Y}_2)}} \min\{ I(U_1; Y_2) + H(X_1 | U_1),\ r_2 + r_3 - I(\hat{Y}_2; Y_2 | U_1, X_1) \} \tag{17}$$

with cardinalities of alphabets bounded by

$$|\mathcal{U}_1| \le |\mathcal{X}_1| + 4 \tag{18a}$$

$$|\hat{\mathcal{Y}}_2| \le |\mathcal{U}_1| |\mathcal{Y}_2| + 2 \le |\mathcal{X}_1| |\mathcal{Y}_2| + 4 |\mathcal{Y}_2| + 2. \tag{18b}$$

Now, the following theorem shows an alternative capacity expression for diamond networks,
whose proof is in Appendix D.

Theorem 3 (Alternative expression): The capacity of diamond networks is given as

$$\max_{\substack{p(u_1, x_1)\, p(\hat{y}_2 | y_2, u_1): \\ r_3 \ge H(X_1 | U_1, \hat{Y}_2) \\ r_2 + r_3 \ge I(U_1; Y_2) + H(X_1 | U_1) + I(\hat{Y}_2; Y_2 | U_1, X_1)}} I(U_1; Y_2) + H(X_1 | U_1) \tag{19}$$

with cardinalities of alphabets bounded by (18).

Theorem 3 shows that we do not lose optimality when the codebook construction of the
combination of DF and CF is restricted to the superposition of $2^{n(I(U_1; Y_2) - \epsilon)}$ 'cloud centers' $U_1^n$, i.e.,
the part of the message decoded by the noisy relay, and $2^{n(H(X_1 | U_1) - \epsilon)}$ 'satellites' $X_1^n$ for each
$U_1^n$, i.e., the remaining part of the message. This means that the optimality of the combination
of DF and CF at the noisy relay in diamond networks intuitively makes sense since the relay
compresses a noisy observation of almost uncoded information that has no structure. Otherwise,
the optimality of compression after decoding at the noisy relay, which ignores the codebook
structure at the source, would have been counterintuitive.

On the other hand, Theorem 1 gives the following min-cut capacity expression for diamond
networks with cardinalities of alphabets bounded by (18).

$$\max_{p(u_1, x_1)\, p(\hat{y}_2 | y_2, u_1)} \min\{ I(U_1; Y_2) + H(X_1 | U_1),\ r_3 + I(U_1; Y_2) + I(X_1; \hat{Y}_2 | U_1),\ r_2 + r_3 - I(\hat{Y}_2; Y_2 | U_1, X_1) \} \tag{20}$$

We note that the relationship between the two capacity characterizations (19) and (20) is similar
to that between the two equivalent achievable rate characterizations of CF for 3-node relay
networks in [3] and [16], which are given by (21) and (22), respectively. Here, node indices
follow the convention that nodes 1, 2, and 3 are the source, relay, and destination, respectively.

$$\max_{\substack{p(x_1)\, p(x_2)\, p(\hat{y}_2 | y_2, x_2): \\ I(X_2; Y_3) \ge I(Y_2; \hat{Y}_2 | X_2, Y_3)}} I(X_1; \hat{Y}_2, Y_3 | X_2) \tag{21}$$

$$\max_{p(x_1)\, p(x_2)\, p(\hat{y}_2 | y_2, x_2)} \min\{ I(X_1; \hat{Y}_2, Y_3 | X_2),\ I(X_1, X_2; Y_3) - I(Y_2; \hat{Y}_2 | X_1, X_2, Y_3) \} \tag{22}$$

VII. Conclusion 

We characterized the capacity of a class of multicast tree networks having an arbitrary number 
of nodes, which includes the class of diamond networks studied in [12] as a special case. For
achievability, we constructed a robust coding scheme that uses a combination of DF and CF in 
every noisy relay and a random binning in every noiseless relay in a way that the codebook 
constructions and relay operations are independent for each node. For converse, we used a 
novel technique of iteratively manipulating inequalities exploiting the tree topology. For diamond 
networks, we showed that the optimality of the combination of DF and CF at the noisy relay is 
intuitively convincing by proving that it does not lose optimality to restrict the coding scheme 
such that what is compressed after decoding at the noisy relay is a noisy observation of almost 
uncoded information. 

Appendix A 
Proof of Corollary [H 

Let $C_{p_k,k} = \max_{p(x_{p_k})} I(X_{p_k};Y_k)$ for $k \in [2:N]$ denote the point-to-point capacity between
nodes $p_k$ and $k$, and let $C(k,d)$ for $d \in [1:K]$ and $k$ such that $L_k \ni D_d$ denote the capacity of
tree network $T_k$ with a source $k$ and a single destination $D_d$. For a lower bound on the right-hand
side of ([B, let us choose the joint distribution $\prod_{k \in [1:N]} p(u_k,x_k)p(\hat{y}_{n_k}|u_k,y_{n_k})$ as follows:

• For $k$ such that $k \in T_{a_d}$ for some $d \in [1:K]$, choose $p(u_k,x_k)p(\hat{y}_{n_k}|u_k,y_{n_k})$ that achieves
$C(a_d,d)$.

• For $k$ such that $k \notin T_{a_d}$ for all $d \in [1:K]$ and $n_k \neq \emptyset$, choose $p(x_k)$ that achieves $C_{k,n_k}$
and let $U_k = X_k$ and $\hat{Y}_{n_k} = \emptyset$.

• For $k$ such that $k \notin T_{a_d}$ for all $d \in [1:K]$ and $n_k = \emptyset$, let $X_k$ be uniformly distributed over
$\mathcal{X}_k$ and let $U_k = \hat{Y}_{n_k} = \emptyset$.
For the above choice of distribution, we obtain the following lower bound. 

C>niin< min minCfc,, min C(ad,d) 



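$= \min\Bigl\{ \min_{d \in [1:K]} \min_{k \in G_d \cap T_{a_d}^c} \min_{j \in Z_k \cap G_d} C_{k,j},\; \min_{d \in [1:K]} C(a_d,d) \Bigr\}$

$= \min_{d \in [1:K]} \min\Bigl\{ \min_{k \in G_d \cap T_{a_d}^c} \min_{j \in Z_k \cap G_d} C_{k,j},\; C(a_d,d) \Bigr\}$

$= \min_{d \in [1:K]} C(1,d).$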

Now, note that the right-hand side of dU is clearly upper-bounded by $\min_{d \in [1:K]} C(1,d)$. Hence,
the lower and upper bounds in Theorem 1 coincide. ■

Appendix B 
Proof of Lemma 3

Consider $k \in [1:N]$. We have

n 

i=l 
n 

j=i 

n 

= 2^ H^k,i+l^ ^Uk ' ^rik^i) - H^k,i+l'': ynkA^nk ) + H{^k,i\^k,i+1^ ^n^ ) + H^k,i] ^4 \^k,i-i 
i=l 
n 

= / Ji^kA+l^yL ;^nfc,«) +-^(-^fc,i|-^fc,J+l5^nfc ) 
i=l 
n 

where (a) is from the Csiszár sum identity [17], which proves (12a).
We have

n 



i=l 
n 



i=l 

n 

= Y,H{Xk^Uk,X,,i)- (23) 



Note that combining (23) with (12a) proves (12b).
We have 



r/'v" ■ V" I V"^ — \ ^ T(V ■ V" I "V"" V^~^\ 

i=\ 

n 

i=l 
n 



where (a) is from the following Markov chains: 






which proves (12c).



Appendix C 
Proof of Lemma 4

For $k \in [1:N]$, the inequality (13a) holds trivially.
For $k \in B_{S_d,d}$, we have



Tf y^- v" \ ^ T( y^- V" \ — T( v^- V" I v" 



E ^(^^"'^v 



{d) 



ieMfeHGd 
where (a) is from the Markov chain 

^LfeHL^^nDd ^ -'^fc ^ Yir^^nD^y (24) 

(b) is from the Markov chain

for $j \in M_k$, (c) is because $L_j \cap D_d = \emptyset$ for $j \notin G_d$, and (d) is from the following Markov chain

X- o X; ^ Y-^^j,^ (26) 

for $j \in M_k \cap G_d$. Note that (26) holds since $j \in M_k \cap G_d$ is not a leaf node by the definition
of $B_{S_d,d}$. Thus, (13b) is proved.
For $k \in C_{S_d,d}$, we get

(fy r/'v" ■ V" "i -L 7'/' IT"- V" IV" ^ 

ieMfe 

(c) 

where (a) is from the Markov chain 

Y^ ^4. V" ^-4. V" 

(b) is from the Markov chains (24) and (25), (c) is because $L_j \cap D_d = \emptyset$ for $j \notin G_d$, and (d) is
from the Markov chains (26) and

^rifcOGd ^ ^nfcn&'d ^ '^Lnj^nDd- (27) 

Note that (26) and (27) hold since $n_k \cap G_d$ and $j \in M_k \cap G_d$ are not leaf nodes by the definition
of $C_{S_d,d}$. Thus, (13c) is proved. ■

Appendix D 
Proof of Theorem 3

Let us note that the constraint on $r_2$ in (17) can be easily verified to be redundant. Fix $r_2$ and
$r_3$. Let $R_1$ and $R_2$ denote (17) without the constraint on $r_2$ and (19), respectively. It is trivial to
show $R_2 \le R_1$. To show $R_1 \le R_2$, it is enough to show that for all $p(u_1,x_1)p(\hat{y}_2|u_1,y_2)$ such
that $R \le I(U_1;Y_2) + H(X_1|U_1)$ and $r_3 \ge H(X_1|U_1,\hat{Y}_2)$, where $R \triangleq r_2 + r_3 - I(Y_2;\hat{Y}_2|U_1,X_1)$,



October 19, 2011 DRAFT 




there exists $p(u_1^*,x_1^*)p(\hat{y}_2^*|u_1^*,y_2)$ that satisfies

$R = I(U_1^*;Y_2) + H(X_1^*|U_1^*)$, (28a)

$R \le r_2 + r_3 - I(Y_2;\hat{Y}_2^*|U_1^*,X_1^*)$, (28b)

$r_3 \ge H(X_1^*|U_1^*,\hat{Y}_2^*)$. (28c)

Now, consider a joint distribution $p(u_1,x_1)p(\hat{y}_2|u_1,y_2)$ such that $R \le I(U_1;Y_2) + H(X_1|U_1)$
and $r_3 \ge H(X_1|U_1,\hat{Y}_2)$. Let $B$ denote a Bernoulli random variable with parameter $\lambda \in [0,1]$.
Let $(U_1',X_1',\hat{Y}_2')$ and $(U_1'',X_1'',\hat{Y}_2'')$ denote the triplets of random variables given as

$$(U_1',X_1',\hat{Y}_2') = \begin{cases} (U_1,X_1,\hat{Y}_2) & \text{if } B = 1 \\ (X_1,X_1,\emptyset) & \text{if } B = 0 \end{cases}, \qquad (U_1'',X_1'',\hat{Y}_2'') = \begin{cases} (\emptyset,\emptyset,\emptyset) & \text{if } B = 1 \\ (X_1,X_1,\emptyset) & \text{if } B = 0. \end{cases}$$

We will show the existence of $p(u_1^*,x_1^*)p(\hat{y}_2^*|u_1^*,y_2)$ that satisfies (28) separately for the cases of
$R \ge I(X_1;Y_2)$ and $R < I(X_1;Y_2)$. First, consider the case of $R \ge I(X_1;Y_2)$. Let $U_1^* = (U_1',B)$,
$X_1^* = X_1'$, and $\hat{Y}_2^* = (\hat{Y}_2',B)$. Note that $I(U_1^*;Y_2) + H(X_1^*|U_1^*)$ is a continuous function of $\lambda$
and becomes $I(U_1;Y_2) + H(X_1|U_1)$ and $I(X_1;Y_2)$ when $\lambda = 1$ and $\lambda = 0$, respectively. From
the intermediate value theorem, there exists $\lambda \in [0,1]$ such that $R = I(U_1^*;Y_2) + H(X_1^*|U_1^*)$.
Furthermore, (28b) and (28c) are satisfied from

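$I(Y_2;\hat{Y}_2^*|U_1^*,X_1^*) = I(Y_2;\hat{Y}_2'|U_1',X_1',B)$
$= \lambda I(Y_2;\hat{Y}_2|U_1,X_1)$
$\le I(Y_2;\hat{Y}_2|U_1,X_1)$

and

$H(X_1^*|U_1^*,\hat{Y}_2^*) = H(X_1'|U_1',\hat{Y}_2',B)$
$= \lambda H(X_1|U_1,\hat{Y}_2)$
$\le H(X_1|U_1,\hat{Y}_2)$,

respectively.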
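The continuity claim in this first case can be made explicit. The sketch below assumes that $B$ is independent of $(U_1,X_1,Y_2,\hat{Y}_2)$ with $\Pr(B=1)=\lambda$, which is how the Bernoulli variable appears to be used above; under that assumption the quantity in (28a) is affine in $\lambda$.

```latex
\begin{align*}
I(U_1^*;Y_2) &= I(B;Y_2) + I(U_1';Y_2|B)
              = \lambda I(U_1;Y_2) + (1-\lambda) I(X_1;Y_2), \\
H(X_1^*|U_1^*) &= H(X_1|U_1',B) = \lambda H(X_1|U_1), \\
I(U_1^*;Y_2) + H(X_1^*|U_1^*)
  &= \lambda \bigl[ I(U_1;Y_2) + H(X_1|U_1) \bigr] + (1-\lambda) I(X_1;Y_2).
\end{align*}
```

Here $I(B;Y_2)=0$ because $X_1^* = X_1$ for both values of $B$, so the channel input does not depend on $B$. When $I(U_1;Y_2) + H(X_1|U_1) > I(X_1;Y_2)$, the value of $\lambda$ promised by the intermediate value theorem is therefore $\lambda = \bigl(R - I(X_1;Y_2)\bigr) / \bigl(I(U_1;Y_2) + H(X_1|U_1) - I(X_1;Y_2)\bigr)$; if the two endpoints coincide, any $\lambda \in [0,1]$ works.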

Next, consider the case of $R < I(X_1;Y_2)$. Let $U_1^* = (U_1'',B)$, $X_1^* = X_1''$, and $\hat{Y}_2^* = (\hat{Y}_2'',B)$.
Note that $I(U_1^*;Y_2) + H(X_1^*|U_1^*)$ is a continuous function of $\lambda$ and becomes $0$ and $I(X_1;Y_2)$
when $\lambda = 1$ and $\lambda = 0$, respectively. From the intermediate value theorem, there exists $\lambda \in [0,1]$
such that $R = I(U_1^*;Y_2) + H(X_1^*|U_1^*)$. Furthermore, (28b) and (28c) are satisfied from

$I(Y_2;\hat{Y}_2^*|U_1^*,X_1^*) = I(Y_2;\hat{Y}_2''|U_1'',X_1'',B) = 0$

and

$H(X_1^*|U_1^*,\hat{Y}_2^*) = H(X_1''|U_1'',\hat{Y}_2'',B) = 0$,

respectively. ■